ORIGINAL RESEARCH



# Variations-tolerant low power wide fan-in OR logic domino circuit

Ankur Kumar<sup>1</sup> · Naman Garg<sup>1</sup> · Devvrat Tyagi<sup>2</sup> · R. K. Nagaria<sup>3</sup>

Received: 9 June 2022 / Accepted: 14 October 2022 © The Author(s), under exclusive licence to Bharati Vidyapeeth's Institute of Computer Applications and Management 2022

**Abstract** This work proposes a novel strategy for reducing delay and power variations, as well as power dissipation, in wide fan-in domino OR circuits with better noise immunity. First, a bulk-driven keeper transistor is used to decrease the keeper's transconductance, which reduces the delay and power variations and removes the minimum threshold-voltage limitation. In addition, an arrangement named the keeper controlling network is developed to control the keeper efficiently, so that power dissipation is reduced and noise immunity is enhanced. As a result, the proposed domino circuit can be used in a wide range of fan-in designs. The proposed domino circuit is evaluated extensively against process corner, voltage, and temperature effects for OR logic implementations with fan-in of 8, 16, 32, and 64 bits to check its reliability and robustness. The simulation results show that the designed domino circuit achieves about 31% and 25% reduction in the variations of power and delay, respectively, and a 1.56 times improvement in noise immunity compared with the conventional domino circuit. All domino circuits in this work are designed and simulated with the SPECTRE simulator in the Cadence Virtuoso environment using 45 nm CMOS technology.

Ankur Kumar ankur@iiitu.ac.in

Naman Garg namangarg@iiitu.ac.in

Devvrat Tyagi devvrat.tyagi@abes.ac.in

R. K. Nagaria rkn@mnnit.ac.in

<sup>1</sup> Indian Institute of Information and Technology Una, Una, Himachal Pradesh 177209, India

<sup>2</sup> ABES Engineering College, Ghaziabad 201009, India

<sup>3</sup> Motilal Nehru National Institute of Technology Allahabad, Allahabad, Uttar Pradesh 211004, India

Keywords Low power VLSI circuits · Process variations · Noise immunity · High speed · Wide fan-in domino circuit

## **1** Introduction

Wide fan-in dynamic circuits are attracting attention in high-speed applications, where they implement basic building blocks such as the read paths of modern microprocessors, register files, arithmetic units, DSPs, flash memories, and tag comparators [1–3]. Dynamic circuits offer higher speed and smaller chip area than static circuits [1]. Despite these advantages, low signal integrity and poor Noise Immunity (NI) are two major drawbacks of dynamic circuits [4]. Therefore, a keeper transistor and a static inverter (INV) are connected in a feedback loop at the dynamic node to enhance the NI and signal integrity, as shown in Fig. 1 [4, 5]; such a combination is called a domino circuit. Domino circuits suffer from excessive Power Dissipation (PD) [6] because of this feedback loop [4]. Device scaling is an effective technique to reduce the PD [7], but it brings other issues such as process variations [8, 9] and speed loss [6]. The threshold voltage ($V_{TH}$) can be reduced to recover the speed, but the subthreshold leakage current [10] in the Pull Down Network (PDN) then increases exponentially, which increases the static PD and reduces the NI [11] as the fan-in increases. Continued technology scaling improves performance, but process and environment variations in performance parameters such as delay, PD, and NI become a major challenge at circuit level [8, 9].



Fig. 1 CDC a without footer transistor, b with footer transistor [3, 4]

By nature, process variations fall mainly into two categories: inter-die and intra-die. Intra-die variation generally increases sharply with each technology generation, which degrades the performance parameters. Thus, intra-die variation is a major concern for the robust design of wide fan-in domino OR logic circuits in nanometer technologies [12, 13]. Alioto et al. [13] analyzed the effect of process variations at the circuit level of abstraction. Because of the feedback loop, the delay and power variability of a domino circuit is almost double that of a dynamic circuit (without feedback). Thus, process variations, subthreshold leakage current, and NI are the key concerns in designing a variation-tolerant and noise-tolerant low power domino circuit for wide fan-in.

In the Conventional Domino Circuit (CDC) [4] shown in Fig. 1, the charge stored on the output capacitor represents the output logic. The CDC [4] consists of a precharge transistor, an evaluation network, and a feedback loop that preserves the correct logic level at the dynamic node [14, 15]. The PD of a domino circuit is given by Eq. (1) [16, 18]:

$$P_{avg} = P_{switching} + P_{short-ckt} + P_{leakage} \tag{1}$$

where $P_{switching}$, $P_{short-ckt}$, and $P_{leakage}$ are the PD due to charging and discharging of the output node, the PD due to a direct (short-circuit) path between the supply voltage ($V_{dd}$) and ground, and the PD due to leakage currents within the devices, respectively [7]. Leakage currents in MOS devices are increasing rapidly with device scaling. Subthreshold leakage is the leading concern among the leakage components in domino circuits because it accounts for a large share of the total leakage current; it is given by Eq. (2) [14–23]:

$$I_{sub-th} = I_0 \left( 1 - e^{\frac{-V_{ds}}{V_{t0}}} \right) e^{\frac{V_{gs} - V_{th} + \eta V_{ds}}{n V_{t0}}} \tag{2}$$

where $V_{gs}$, $V_{ds}$, $V_{th}$, $V_{t0}$, $\eta$, $n$, $I_0$, $\mu_0$, and $C_{ox}$ are the gate-to-source voltage, drain-to-source voltage, threshold voltage, thermal voltage, DIBL coefficient, subthreshold swing coefficient, reverse saturation current, zero-bias carrier mobility, and oxide capacitance, respectively.
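As an illustration of how strongly the subthreshold leakage depends on the threshold voltage, the following minimal Python sketch evaluates Eq. (2) in the commonly used form $I_0 (1 - e^{-V_{ds}/V_{t0}}) e^{(V_{gs} - V_{th} + \eta V_{ds})/(n V_{t0})}$. The function name and all numeric values ($I_0$, $\eta$, $n$, the supply, and the two threshold voltages) are illustrative assumptions and are not taken from this work.

```python
import math

def subthreshold_current(v_gs, v_ds, v_th, i0=1e-7, eta=0.1, n=1.5, v_t0=0.0259):
    """Subthreshold leakage current in amperes.

    All default parameters are assumed, illustrative values:
    i0 (A), DIBL coefficient eta, swing coefficient n, thermal voltage v_t0 (V).
    """
    drain_term = 1.0 - math.exp(-v_ds / v_t0)
    gate_term = math.exp((v_gs - v_th + eta * v_ds) / (n * v_t0))
    return i0 * drain_term * gate_term

# Leakage of an OFF transistor (v_gs = 0 V) for two assumed threshold voltages.
high_vth = subthreshold_current(v_gs=0.0, v_ds=1.0, v_th=0.40)
low_vth = subthreshold_current(v_gs=0.0, v_ds=1.0, v_th=0.30)

# Lowering v_th by 0.10 V multiplies the leakage by exp(0.10 / (n * v_t0)).
ratio = low_vth / high_vth
print(f"leakage increase for a 0.1 V lower threshold: {ratio:.1f}x")
```

With these assumed values, a 0.1 V reduction in threshold voltage raises the leakage by roughly an order of magnitude, which is why lowering $V_{th}$ to recover speed is costly in a wide fan-in PDN.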

The keeper and evaluation transistors play a major role in controlling the process variations [1, 24–34] and the NI [14–23]. The relative strength of the keeper and evaluation transistors is characterised by the Keeper Ratio (KR), as given in Eq. (3) [1, 14–34]:

$$KR = \frac{\mu_p \left(\frac{W}{L}\right)_{keeper}}{\mu_n \left(\frac{W}{L}\right)_{evaluation}} \tag{3}$$

where $\mu_p$ and $\mu_n$ are the hole and electron mobilities, respectively, and W/L is the width-to-length ratio (size) of the transistor.
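A minimal Python sketch of Eq. (3) is given below. The function name, the mobility ratio, and the device dimensions are hypothetical values chosen only to show the arithmetic; they are not the sizes used in this work.

```python
def keeper_ratio(mu_p, mu_n, w_keeper, l_keeper, w_eval, l_eval):
    """Keeper Ratio of Eq. (3): keeper strength relative to evaluation strength."""
    keeper_strength = mu_p * (w_keeper / l_keeper)
    eval_strength = mu_n * (w_eval / l_eval)
    return keeper_strength / eval_strength

# Assumed example: electron mobility 2.5 times the hole mobility,
# keeper half as wide as the evaluation transistor, equal channel lengths.
kr = keeper_ratio(mu_p=1.0, mu_n=2.5,
                  w_keeper=120e-9, l_keeper=45e-9,
                  w_eval=240e-9, l_eval=45e-9)
print(f"KR = {kr:.2f}")
```

Only the ratios matter here, so the mobilities can be given in any consistent unit.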

Many domino circuits have been proposed in previous studies to overcome these problems. Among them, Refs. [14–23] improve the NI and reduce the Power Delay Product (PDP). However, these circuits remain inadequate because their delay and power variations are higher than, or at best equal to, those of the CDC [4]. On the other hand, Refs. [1, 24–34] aim to decrease the delay and power variations but are not able to reduce the PDP. Some domino circuits from the literature are summarized in tabular form to review the available work and establish the ground for the proposed work.

| Ref. | Circuit name | Methodology and outcomes | Limitations |
|------|--------------|--------------------------|-------------|
| [20] | High Speed Domino Circuit (HSDC) | <ul> <li>The keeper transistor is controlled efficiently in the precharge and evaluation phases to reduce the dynamic PD and to improve the NI</li> <li>HSDC resolves the tradeoff between performance and reliability using the multi-threshold voltage concept</li> </ul> | <ul> <li>HSDC suffers from high delay variations as fan-in increases</li> <li>It is also limited by the drawback of the pass transistor, which is used to transfer the output logic to the gate terminal of the keeper transistor</li> </ul> |
| [21] | Leakage Controlled Replica Domino Circuit (LCRDC) | <ul> <li>The keeper transistor is controlled by a reference analog mirror to track all process corners</li> <li>Moreover, the mirror transistor is sized so that the subthreshold leakage current is minimized</li> </ul> | <ul> <li>It is very difficult to size the mirror transistor</li> <li>In the case of high fan-in, the leakage current and the static PD are exorbitant</li> </ul> |
| [22] | Voltage Comparison based Domino Circuit (VCDC) | <ul> <li>Subthreshold leakage is reduced by isolating the output node from the dynamic node</li> <li>Further, a sensor network is added to decide the output logic based on the voltages across the PDN, resulting in a significant reduction in PDP</li> </ul> | <ul> <li>The gate of transistor M4 is floating in the evaluation phase, which decreases the NI</li> <li>M6 is used only to increase the stacking effect, thus area and PD increase</li> </ul> |
| [23] | Low Power Dynamic Circuit (LPDC) | <ul> <li>PD is reduced by decreasing the voltage swing of the dynamic node</li> <li>Further, a reference inverter is integrated in the evaluation network to control the switching of the footer transistor and reduce subthreshold leakage</li> </ul> | <ul> <li>It has low NI and high process variations due to the reduced voltage swing at the dynamic node (DN)</li> <li>Speed is also reduced as a result of the stacking effect in the discharging path</li> </ul> |
| [29] | Variable Threshold Voltage Keeper domino circuit (VTVKC) | <ul> <li>Subthreshold leakage and variations in delay and power are reduced simultaneously in this design</li> <li>The threshold voltage of the keeper is varied according to the working phase to scale down the variations</li> </ul> | <ul> <li>It is limited by the increase in delay caused by the body bias generator during one phase of the clock cycle</li> <li>It requires an extra power supply to vary the threshold voltage</li> </ul> |
| [30] | Self-Calibrating Process Compensating Dynamic circuit (PCDC) | <ul> <li>To bring down the process variations, a self-calibrating Process Compensating Dynamic (PCD) technique is proposed, which restores the robustness of the circuit</li> <li>In the worst case, the keeper strength can also be adjusted to minimize the leakage</li> </ul> | <ul> <li>The increased area is the drawback of this circuit design approach</li> <li>This technique is limited by its complexity</li> </ul> |
| [1] | Variation-Tolerant Keeper Domino Circuit (VTKDC) | <ul> <li>The idea is to find a graphical representation of the trade-off between NI and process variations</li> <li>A replica bias technique is therefore used to keep the circuit independent of random variations (e.g., random dopant fluctuation, temperature, voltage, leakage current)</li> </ul> | <ul> <li>PD is increased because of the sensor and the extra keeper transistor</li> <li>It suffers from large charge sharing, as in the CDC [4]</li> </ul> |
| [31] | Simple Approach to Reduce Delay in Domino circuit (SARDDC) | <ul> <li>First design to reduce the keeper's own transconductance by using the source degeneration technique</li> <li>An extra PMOS is stacked in series with the keeper transistor to reduce its transconductance, so that process variations in delay and power are brought down effectively</li> </ul> | <ul> <li>It has high switching PD due to the extra keeper transistor</li> <li>It has low NI for high fan-in</li> </ul> |
| [32] | Variation- and Noise-Aware Reliable Dynamic Circuit (VNARDC) | <ul> <li>The Schmitt Trigger topology is incorporated in this design</li> <li>This domino circuit significantly mitigates the process variation in power and delay</li> </ul> | <ul> <li>It is limited by the area overhead</li> <li>This domino circuit is not able to reduce the PDP in the case of wide fan-in</li> </ul> |
| [33] | Clock Delayed Dual Keeper Domino Circuit (CDDKDC) | <ul> <li>An extra PMOS is stacked in series with the keeper transistor and controlled by a delayed clock pulse to create source degeneration</li> <li>This arrangement reduces delay variability by using the source degeneration technique</li> </ul> | <ul> <li>It has high PDP due to ineffective control of the keeper transistor</li> <li>It incurs area and delay penalties because of the extra PMOS keeper transistor</li> </ul> |
| [34] | High Speed of Clock-Delayed Dual Keeper Domino Circuit (HSCDDKDC) | <ul> <li>A controlled delay element is added to enable the switching of the keeper transistor. This modification significantly improves the speed of the circuit</li> <li>Further, variation is reduced by effective sizing of the keeper network</li> </ul> | <ul> <li>It is complex because of the two different power supplies</li> <li>NI degrades for wide fan-in OR logic</li> </ul> |

From this review, it can be seen that the PDP is reduced at enhanced NI in most of the domino circuits. Variations in delay and power are overcome in some of these circuits. However, these problems have not been addressed simultaneously in a single domino circuit. Hence, the proposed work is designed to decrease the process variations and the PDP, by using the bulk-driven technique and efficient control of the keeper transistor, respectively, at enhanced NI.

The paper is organized as follows: the literature review of domino circuits and the theoretical background are presented in Sects. 1 and 2. The proposed domino circuit, which decreases the delay and power variations and the PDP simultaneously at enhanced NI, is presented in Sect. 3. Simulations and comparisons that confirm the performance improvement of the proposed domino circuit over existing domino circuits are given in Sect. 4. The conclusion and applicability of this work are discussed in Sect. 5.

## **2** Preliminary study

• Noise Immunity:

Measuring the NI of a domino circuit is crucial for assessing its reliability. Several design metrics have been developed to define NI. In this work, Unity Noise Gain (UNG) is used to measure NI [26–28]. The UNG is the amplitude of the input noise that produces an output noise of the same amplitude, as given in Eq. (4).

$$UNG = \{ V_{noise} : V_{noise} = V_{out} \} \tag{4}$$

Here, the noise pulse is chosen to closely resemble actual noise such as glitches, ground bounce, and crosstalk. In general, both the duration and the amplitude can be varied to change the level of the noise pulse; only the amplitude is varied in this work.
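The UNG condition of Eq. (4) can be located numerically by searching for the input noise amplitude at which the output amplitude equals it. The Python sketch below does this by bisection. The function `output_peak` is a hypothetical analytic stand-in for a circuit simulation, and its parameters (`vdd`, `v_sw`, `p`) are assumed values; in practice each evaluation would be a transient simulation of the gate.

```python
def output_peak(v_noise, vdd=1.0, v_sw=0.4, p=8):
    """Hypothetical stand-in for a simulated gate: peak output voltage
    produced by an input noise pulse of amplitude v_noise (volts)."""
    x = (v_noise / v_sw) ** p
    return vdd * x / (1.0 + x)

def unity_noise_gain(response, v_lo, v_hi, tol=1e-6):
    """Bisection search for the amplitude where response(v) == v.

    Assumes the output is below the input at v_lo and above it at v_hi.
    """
    if not (response(v_lo) < v_lo and response(v_hi) > v_hi):
        raise ValueError("search interval does not bracket the UNG point")
    while v_hi - v_lo > tol:
        mid = 0.5 * (v_lo + v_hi)
        if response(mid) < mid:
            v_lo = mid   # output still smaller than the noise: raise the amplitude
        else:
            v_hi = mid   # output exceeds the noise: lower the amplitude
    return 0.5 * (v_lo + v_hi)

ung = unity_noise_gain(output_peak, v_lo=0.05, v_hi=0.4)
print(f"UNG of the stand-in gate: {ung:.4f} V")
```

A higher UNG means that a larger noise amplitude is needed before the noise propagates undiminished to the output, i.e. better NI.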

• Figure of Merit:

Another important performance criterion used to assess the proposed domino circuit is the Figure of Merit (FOM) [19], defined as the ratio of UNG to the product of PD, delay, and area, as given in Eq. (5). All components are normalized to those of the CDC [4].

$$FOM = \frac{\text{UNG}_{\text{norm}}}{\text{PD}_{\text{norm}} \times \text{Delay}_{\text{norm}} \times \text{Area}_{\text{norm}}} \tag{5}$$

Temperature, frequency, supply voltage, and technology are other parameters that may alter the FOM.
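The normalization in Eq. (5) can be sketched in Python as follows. The function name and the dictionary layout are hypothetical, and the numbers are placeholders chosen for easy arithmetic, not measured results from this work.

```python
def figure_of_merit(circuit, reference):
    """FOM of Eq. (5): every metric is normalized to the reference (CDC) value."""
    ung_norm = circuit["ung"] / reference["ung"]
    pd_norm = circuit["pd"] / reference["pd"]
    delay_norm = circuit["delay"] / reference["delay"]
    area_norm = circuit["area"] / reference["area"]
    return ung_norm / (pd_norm * delay_norm * area_norm)

# Placeholder metrics in arbitrary but consistent units (assumed values).
cdc = {"ung": 0.20, "pd": 10.0, "delay": 50.0, "area": 4.0}
candidate = {"ung": 0.30, "pd": 8.0, "delay": 50.0, "area": 5.0}

fom_cdc = figure_of_merit(cdc, cdc)              # the reference always scores 1
fom_candidate = figure_of_merit(candidate, cdc)  # 1.5 / (0.8 * 1.0 * 1.25)
print(fom_cdc, fom_candidate)
```

By construction the reference circuit has an FOM of 1, so any value above 1 indicates a better overall trade-off than the CDC.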

• Process variation in domino:

To design a wide fan-in domino circuit with lower delay and power variations, it is necessary to examine the feedback factor of the loop formed by the static INV and the keeper transistor in the CDC [4], as shown in Fig. 2. It is shown in [13] and [31] that the delay and power variations in a domino circuit are mainly due to this feedback loop. The feedback factor is therefore expressed mathematically in Eqs. (6) and (7) to show how the variations can be reduced.

$$T = A_{Inv} \, G_{mk} \, Z_{DN} \tag{6}$$

$$S = \frac{1}{1 - T} \tag{7}$$

where $A_{Inv}$, $G_{mk}$, and $Z_{DN}$ represent the gain of the static INV, the transconductance of the keeper transistor, and the intrinsic load at the dynamic node, respectively.

At the start of the evaluation phase, $A_{Inv}$ is less than one because the voltages at the dynamic and output nodes are equal to $V_{dd}$ and 0 V, respectively [31]. At the same time, $G_{mk}$ is also less than unity because the keeper transistor is in the linear region. Therefore, the total loop gain $T$ in Eq. (6) is positive and less than unity. Equation (7) relates the sensitivity $S$ of the closed-loop gain to variations, following basic control theory [35]. $S$ tends to infinity as $T$ approaches unity, which means that the circuit is highly sensitive to variations. $S$ tends to unity as $T$ approaches zero, which means that the effect of process variations is reduced. Therefore, the loop gain must be decreased to reduce the delay and power variations.

Fig. 2 Analysis of CDC [4] in current form

The loop gain is directly proportional to $A_{Inv}$, $G_{mk}$, and $Z_{DN}$, so it can be lowered in the same proportion by lowering any one of them. Because $A_{Inv}$ and $Z_{DN}$ are very hard to control owing to their own limitations [33], $G_{mk}$ is reduced in this work to decrease the delay and power variations while other parameters such as PD and delay remain intact.
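The effect of lowering $G_{mk}$ on the sensitivity can be illustrated with a short Python sketch of Eqs. (6) and (7). The values of $A_{Inv}$, $G_{mk}$, and $Z_{DN}$ below are normalized, assumed numbers chosen so that the loop gain stays between 0 and 1; the 0.3 scaling factor is likewise only an example.

```python
def loop_gain(a_inv, g_mk, z_dn):
    """Loop gain T of Eq. (6)."""
    return a_inv * g_mk * z_dn

def sensitivity(t):
    """Sensitivity S of Eq. (7); valid for a positive loop gain below unity."""
    if not 0.0 <= t < 1.0:
        raise ValueError("Eq. (7) is used here only for 0 <= T < 1")
    return 1.0 / (1.0 - t)

# Assumed, normalized operating point for a gate-driven keeper.
a_inv, g_mk, z_dn = 0.8, 1.0, 1.0
s_gate = sensitivity(loop_gain(a_inv, g_mk, z_dn))        # T = 0.8

# Same loop with the keeper transconductance scaled to 30% (example factor).
s_low_gm = sensitivity(loop_gain(a_inv, 0.3 * g_mk, z_dn))  # T = 0.24

print(f"S = {s_gate:.2f} with full G_mk, S = {s_low_gm:.2f} with reduced G_mk")
```

In this example the sensitivity falls from 5 to about 1.3, which illustrates why a modest reduction in keeper transconductance has a large effect when the original loop gain is close to unity.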

• Bulk-driven MOS:

The keeper transistor is used in a bulk-driven configuration, i.e., it is operated through bulk control rather than gate control. The operation of a bulk-driven MOSFET is similar to that of a depletion-type MOSFET [36]. First, the gate voltage ($V_{DC}$) is fixed just below the threshold voltage of the keeper transistor, so that a depletion layer is formed while the drain current has not yet started to flow. A voltage is then applied to the bulk (body) terminal to modulate the drain current through the transistor [36, 37]. The small-signal schematic of the bulk-driven keeper transistor is shown in Fig. 3; the drain current is given by Eqs. (8) and (9), and the resulting bulk transconductance by Eq. (10).



Fig. 3 Small signal schematic of Bulk-driven Keeper Transistor



Fig. 4 Schematic representation of proposed domino circuit

$$I_d = \frac{K'W}{L} \left( V_{gs} - V_{t0} - \gamma \left( \sqrt{2\phi_f - V_{bs}} - \sqrt{2\phi_f} \right) - \frac{n}{2} V_{ds} \right) V_{ds}, \quad V_{ds} < V_{ds}(sat) \tag{8}$$

$$I_d = \frac{K'W}{L} \left( V_{gs} - V_{t0} - \gamma \left( \sqrt{2\phi_f - V_{bs}} - \sqrt{2\phi_f} \right) \right)^2, \quad V_{ds} > V_{ds}(sat) \tag{9}$$

$$G_{mbk} = \frac{\mathrm{d}I_d}{\mathrm{d}V_{bs}} = \frac{\gamma G_{mk}}{2\sqrt{2\phi_f - V_{bs}}} = \eta G_{mk} \tag{10}$$

where $K'$, $\gamma$, and $\phi_f$ represent the transconductance parameter, the substrate-bias (body-effect) coefficient, and the Fermi potential, respectively, and $V_{t0}$ in Eqs. (8) and (9) denotes the zero-bias threshold voltage. $G_{mbk}$ and $G_{mk}$ are the transconductances of the keeper transistor under the bulk-driven and gate-driven techniques, respectively, and $\eta$ is the ratio of $G_{mbk}$ to $G_{mk}$. The range of this ratio depends on the technology node; it is 0.2–0.4 for channel lengths below 90 nm [38]. Therefore, according to Eq. (10), the transconductance of the keeper is lower under the bulk-driven technique than under the gate-driven technique [39].
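The ratio in Eq. (10) can be evaluated directly, as in the Python sketch below. The function name and the values of $\gamma$, $\phi_f$, and $V_{bs}$ are assumptions made for illustration and do not come from the 45 nm model used in this work.

```python
import math

def bulk_to_gate_ratio(gamma, phi_f, v_bs):
    """Ratio eta = G_mbk / G_mk from Eq. (10)."""
    under_root = 2.0 * phi_f - v_bs
    if under_root <= 0.0:
        raise ValueError("Eq. (10) requires 2*phi_f - v_bs > 0")
    return gamma / (2.0 * math.sqrt(under_root))

# Assumed device parameters: gamma in V^0.5, phi_f and v_bs in volts.
eta = bulk_to_gate_ratio(gamma=0.4, phi_f=0.4, v_bs=0.0)

# Bulk transconductance for a normalized gate transconductance of 1.
g_mk = 1.0
g_mbk = eta * g_mk
print(f"eta = {eta:.3f}, G_mbk = {g_mbk:.3f} x G_mk")
```

For these assumed parameters the ratio is about 0.22, which lies within the 0.2–0.4 range quoted from [38].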

## **3** Proposed domino

The proposed domino circuit, depicted in Fig. 4, is designed using the bulk-driven technique. This technique significantly reduces the delay and power variations and enhances the NI. However, the PD increases somewhat because of the depletion layer that is formed. To manage the switching behavior of the keeper transistor, a combination of transistors known as the keeper controlling network is implemented, which minimizes the PD and also improves the NI.

The bulk-driven technique provides lower transconductance than a gate-driven device and removes the threshold voltage limitation. As per Eq. (10), the transconductance of the bulk-driven keeper is reduced to 0.2–0.4 times the gate-driven transconductance [36, 38], which reduces the loop gain factor ($T$) in the same proportion. This reduction in loop gain significantly decreases the delay and power variations, so the reduced transconductance makes the domino circuit more robust against variations. Furthermore, the threshold voltage constraint vanishes, and the transistor responds to both positive and negative bias voltages ($V_{bs}$) because of the depletion layer formed by $V_{DC}$ [37]. A small voltage at the output node, applied to the body of the keeper transistor, is sufficient to turn it on and quickly charge the dynamic node to $V_{dd}$. In addition, a single minimum-size keeper transistor can hold the dynamic node at the correct logic level even for large fan-in. This minimum size also keeps the PD and variations low, because the transconductance of a transistor is directly proportional to its width. The dynamic node can now charge and discharge fully, which maximizes the dynamic range of the domino circuit. Therefore, both characteristics of the bulk-driven technique, low transconductance and no threshold voltage limitation, are beneficial for the domino circuit.

Furthermore, an extension of HSD [22] is incorporated in the proposed domino circuit to control the keeper transistor efficiently, in order to minimize PDP and enhance NI. In this arrangement (the keeper controlling network), the output node and the supply voltage are connected to the body terminal of the keeper transistor through a transmission gate (TG) and a PMOS transistor ($M_{pc}$), respectively. Transistor $M_{pc}$ turns OFF the keeper transistor in the precharge phase, which significantly reduces dynamic PD. A TG is used in place of a pass transistor because it transfers the full '$V_{dd}$' and '0' levels without degradation [40], so that the keeper transistor can be turned completely ON and OFF. The lower ON resistance of the TG also reduces the overall delay of the domino circuit [40]. Thus, in the evaluation phase the TG transfers the exact output logic level to the body of the keeper transistor with high switching speed, which quickly turns the keeper transistor ON and OFF so that the dynamic node can fully charge and discharge.
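The behavior of the keeper controlling network can be summarized with a small behavioral model. This is a minimal sketch under idealized assumptions: the switching delay of $M_{pc}$ and the TG is ignored, voltages are treated as ideal rail levels, and the $V_{dd}/2$ threshold used to decide whether the keeper conducts is a simplification introduced here.

```python
VDD = 1.0  # supply voltage in volts (nominal value used in the simulations)

def keeper_body_voltage(clk: int, v_out: float) -> float:
    """Idealized body voltage of the keeper transistor.

    clk = 0 (precharge): M_pc is ON and the TG is OFF, so the body sits at VDD.
    clk = 1 (evaluation): M_pc is OFF and the TG is ON, so the body follows
    the output node.
    """
    return VDD if clk == 0 else v_out

def keeper_is_on(clk: int, v_out: float) -> bool:
    """Treat the keeper as ON when its body is pulled below VDD/2 (a simplification)."""
    return keeper_body_voltage(clk, v_out) < VDD / 2

# (clk, output voltage) -> keeper state
for clk, v_out in [(0, 0.0), (1, 0.0), (1, VDD)]:
    print(clk, v_out, keeper_is_on(clk, v_out))
```

The three cases correspond to precharge (keeper OFF), evaluation with a low output (keeper ON to restore the dynamic node), and evaluation with a high output (keeper OFF).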

In the case of wide fan-in OR logic, unwanted discharging of the dynamic node is increased by subthreshold leakage current. Therefore, the bulk-driven keeper transistor is used in the evaluation phase to re-charge the dynamic node quickly and under efficient control, whereas the footer transistor is incorporated to avoid any leakage during the precharge phase. Thus, both the dynamic and output nodes are shielded from noise and unnecessary charge sharing. Consequently, NI is enhanced along with reduced PDP in the proposed domino.

Both phases of the proposed domino circuit's operation, which are similar to those of CDC [4], are described in detail below.

Precharge Phase:

In this phase, the clock (CLK) and $\overline{CLK}$ are at logic '0' and '1' ($V_{dd}$), respectively, and all inputs are at logic '0'. Thus, all transistors in the PDN, including the footer transistor, are turned OFF, and transistor $M_{pre}$ turns ON. $M_{pre}$ charges the dynamic node to logic high, which in turn discharges the output node through the low-skewed inverter. After a certain delay, $M_{pc}$ turns ON and the TG turns OFF, so a high voltage is applied to the bulk of the keeper transistor and the keeper is turned OFF. Keeping $M_{footer}$ OFF avoids any unnecessary discharging of the dynamic node. At the end of this phase, the dynamic and output nodes are at logic '1' and '0', respectively.

Evaluation Phase:

At the start of this phase, the clock (CLK) and $\overline{CLK}$ are at logic '1' ($V_{dd}$) and '0', respectively; consequently, transistor $M_{pre}$ turns OFF and the footer turns ON. After a certain delay, $M_{pc}$ turns OFF and the TG turns ON. Since the TG is ON, the output node is directly connected to the body of the keeper, so the keeper is controlled by the output node. The dynamic node may now discharge or be re-charged depending on the state of the PDN transistors. The remainder of the evaluation phase therefore depends on the inputs applied to the PDN (NMOS transistors connected in parallel), each of which may be '1' or '0'.

Two cases can occur. In the first case, all inputs are at logic '0', so all parallel transistors in the PDN are OFF and the dynamic and output nodes are at logic high and low, respectively. Because the TG is ON, logic low is applied to the body of the keeper transistor. The keeper therefore re-charges the dynamic node if any discharging occurs because of subthreshold leakage current. Hence, the dynamic and output nodes stay at logic high '1' and low '0', respectively.

In the second case, at least one input is at logic high, so at least one PDN transistor is turned ON. The dynamic node then has at least one discharging path and begins to discharge through it, while the output capacitor begins to charge through the low-skewed inverter. The output node then reaches logic '1', which completely turns OFF the keeper transistor. Finally, logic '0' and '1' appear at the dynamic and output nodes, respectively. The output waveform obtained by simulating the proposed domino to verify the 32-bit OR logic is given in Fig. 5.
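The two-phase operation described above can be captured in a purely logical model. The following sketch abstracts away all timing, leakage, and analog effects; the function name `domino_or_cycle` is hypothetical and only mirrors the precharge and evaluation steps in the text.

```python
from typing import Sequence, Tuple

def domino_or_cycle(inputs: Sequence[int]) -> Tuple[int, int]:
    """Return (dynamic_node, output) at the end of one precharge/evaluation cycle."""
    # Precharge: M_pre charges the dynamic node; the inverter drives the output low.
    dynamic, out = 1, 0
    # Evaluation: any conducting PDN transistor discharges the dynamic node.
    if any(inputs):
        dynamic = 0  # a discharge path exists
        out = 1      # inverter output rises, which turns the keeper OFF
    else:
        dynamic = 1  # keeper (body at logic '0' through the TG) restores leakage loss
        out = 0
    return dynamic, out

print(domino_or_cycle([0] * 32))        # first case: all inputs low
print(domino_or_cycle([0] * 31 + [1]))  # second case: one input high
```

In this abstraction the output always equals the logical OR of the inputs and the dynamic node is its complement, which is the behavior the simulated waveform is meant to confirm.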

#### 4 Results and comparison of simulation

All the circuits are designed and simulated under the same environment, using the SPECTRE simulator in the Cadence Virtuoso tool, to observe the performance parameters and verify the results so that a fair comparison can be made. Initially, the temperature, power supply, frequency, and output load are set to room temperature, 1 V, 1 GHz, and 5 fF, respectively. These values are then varied to account for environmental effects on the proposed circuit. The performance parameters of the proposed circuit are evaluated for different fan-in (8, 16, 32, and 64 inputs) under worst-case delay. The simulation setup given in Fig. 6, which follows the framework of Alioto et al. [13], is used to obtain all the performance specifications.

The average normalized PD and UNG of the previously reported circuits and the proposed domino circuit at different fan-in are presented in Table 1. The table shows that PD is reduced by 18–37% in the proposed domino with respect to CDC [4], and that the NI of the proposed domino is improved by 1.43–1.56 times compared with CDC [4]. The PD reduction results from the efficient control of the keeper transistor, while the NI improvement is due to the bulk-driven technique, which removes the minimum threshold-voltage requirement. Moreover, the dynamic node is charged and discharged efficiently because of the controlled keeper transistor. Compared with the previously reported domino circuits, the proposed domino therefore shows the highest UNG at every fan-in and the lowest PD at fan-in of 16 bits and above (at 8-bit fan-in only LCRDC [21] has lower PD).
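The quoted percentage ranges follow directly from the normalized entries of Table 1. As a quick check, the sketch below converts the proposed-domino column into percentage PD savings relative to the CDC baseline; the dictionary and function names are illustrative.

```python
# Proposed-domino column of Table 1 (values normalized to CDC [4]).
normalized_pd = {8: 0.82, 16: 0.75, 32: 0.66, 64: 0.63}
normalized_ung = {8: 1.43, 16: 1.44, 32: 1.48, 64: 1.56}

def pd_reduction_percent(norm_pd: float) -> float:
    """Percentage PD saving relative to the CDC baseline (normalized to 1.00)."""
    return round((1.0 - norm_pd) * 100.0, 1)

for fan_in in sorted(normalized_pd):
    saving = pd_reduction_percent(normalized_pd[fan_in])
    print(f"{fan_in:>2}-bit: PD reduced by {saving}%, UNG x{normalized_ung[fan_in]:.2f}")
```

The smallest saving (18%) occurs at 8-bit fan-in and the largest (37%) at 64-bit fan-in, matching the range stated in the text.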

Using Eq. (5), the FOM values of all previously reported and proposed domino circuits are computed and illustrated for 32-bit fan-in as a bar chart in Fig. 7. The FOM of each domino circuit is normalized to that of CDC [4] so that the improvement of the proposed domino over the previously reported circuits is easy to see. The area used to calculate the FOM is estimated from the transistor count of the designed circuit. Since the area of a PMOS transistor is nearly twice that of an NMOS transistor, each PMOS is counted as two NMOS transistors. As Fig. 7 shows, the proposed domino has a greater FOM than the other reported domino circuits.
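The area estimate and the normalization step can be illustrated as follows. This sketch covers only the transistor-count area rule and the normalization to a baseline; it does not evaluate Eq. (5), and the transistor counts are hypothetical placeholders rather than counts of the actual circuits.

```python
def equivalent_area(n_nmos: int, n_pmos: int) -> int:
    """Area estimate in NMOS-equivalent units: each PMOS counts as two NMOS."""
    return n_nmos + 2 * n_pmos

def normalize(value: float, baseline: float) -> float:
    """Express a metric relative to the baseline circuit (CDC [4] in this work)."""
    return value / baseline

# Hypothetical transistor counts, not taken from the paper.
baseline_area = equivalent_area(n_nmos=10, n_pmos=3)   # 10 + 2*3 = 16
candidate_area = equivalent_area(n_nmos=12, n_pmos=5)  # 12 + 2*5 = 22
print(normalize(candidate_area, baseline_area))        # 22 / 16 = 1.375
```

Normalizing every metric to the same baseline makes the bar chart dimensionless, so circuits can be compared without reference to absolute units.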

| Fan-in (delay) | Metric         | CDC [4] | HSDC [20] | LCRDC [21] | LPDC [23] | VTVKC [29] | VTKDC [1] | SARDDC [31] | VNRDC [32] | HSCDDKDC [34] | Proposed domino |
|----------------|----------------|---------|-----------|------------|-----------|------------|-----------|-------------|------------|---------------|-----------------|
| 8-bit (60 ps)  | Normalized PD  | 1.00    | 0.90      | 0.77       | 0.85      | 0.99       | 0.92      | 0.83        | 0.96       | 0.89          | 0.82            |
|                | Normalized UNG | 1.00    | 1.17      | 1.07       | 1.30      | 1.27       | 1.10      | 1.20        | 1.27       | 1.23          | 1.43            |
| 16-bit (70 ps) | Normalized PD  | 1.00    | 0.88      | 0.85       | 0.91      | 0.98       | 0.93      | 0.78        | 0.95       | 0.91          | 0.75            |
|                | Normalized UNG | 1.00    | 1.19      | 1.07       | 1.30      | 1.26       | 1.15      | 1.19        | 1.33       | 1.26          | 1.44            |
| 32-bit (80 ps) | Normalized PD  | 1.00    | 0.97      | 0.73       | 0.79      | 0.92       | 0.89      | 0.74        | 0.87       | 0.82          | 0.66            |
|                | Normalized UNG | 1.00    | 1.17      | 1.04       | 1.30      | 1.35       | 1.13      | 1.17        | 1.35       | 1.26          | 1.48            |
| 64-bit (90 ps) | Normalized PD  | 1.00    | 0.95      | 0.72       | 0.73      | 0.87       | 0.82      | 0.82        | 0.90       | 0.78          | 0.63            |
|                | Normalized UNG | 1.00    | 1.11      | 1.00       | 1.28      | 1.28       | 1.06      | 1.11        | 1.33       | 1.22          | 1.56            |

**Table 1** Comparative analysis of normalized values of PD and UNG at different fan-in among the proposed domino and previously reported domino circuits, keeping similar delay



Fig. 5 Simulated output waveform for the proposed domino circuit at 32-bit fan-in OR gate



Fig. 6 Simulation setup used in this work

The most important task is to validate the proposed domino circuit against process and environment variations. Therefore, the proposed and previously reported domino circuits are simulated to find the standard deviation (SD) of delay and PD at different fan-in and thereby quantify the reduction in variations. All normalized SD values are listed in Table 2. In this work, Monte Carlo simulation with 2000 iterations is performed to capture the effect of process variations so that highly accurate results can be obtained, as in [31]. Table 2 shows up to a 25% reduction in the SD of PD and up to a 31% reduction in the SD of delay for the proposed domino compared with CDC [4], both at 64-bit fan-in. The delay and power variations are reduced because of the keeper transistor's reduced transconductance.
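The way SD and variability are extracted from a Monte Carlo run can be sketched with a toy model. The code below is not the SPECTRE flow, which relies on the statistical device models of the process kit; it simply draws 2000 Gaussian samples and assumes that delay responds linearly to the random variable. The `sensitivity` parameter and its two values are hypothetical stand-ins for a circuit that couples strongly or weakly to process variation.

```python
import random
import statistics

def monte_carlo_delay(sensitivity: float, runs: int = 2000, seed: int = 0) -> list:
    """Toy model: delay = nominal * (1 + sensitivity * x), with x ~ N(0, 1).

    `sensitivity` is a hypothetical knob describing how strongly process
    variation couples into delay; it is not a parameter from the paper.
    """
    rng = random.Random(seed)
    nominal_ps = 80.0  # delay used for 32-bit fan-in in Tables 1 and 2
    return [nominal_ps * (1.0 + sensitivity * rng.gauss(0.0, 1.0)) for _ in range(runs)]

def sigma_and_variability(samples):
    """Return (sigma, sigma/mu) for a list of samples."""
    mu = statistics.fmean(samples)
    sigma = statistics.stdev(samples)
    return sigma, sigma / mu

high_sens = sigma_and_variability(monte_carlo_delay(sensitivity=0.10))
low_sens = sigma_and_variability(monte_carlo_delay(sensitivity=0.03))
print(high_sens, low_sens)
```

Because both runs share the same seed, the ratio of the two standard deviations equals the ratio of the sensitivities, which isolates the effect of the coupling strength from sampling noise.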

Fig. 7 Comparative analysis of the normalized FOM

The effect of process, voltage, and temperature (PVT) variations must be included to substantiate stability, robustness, and reliability. Thus, the proposed domino and some previously reported circuits are evaluated at different process corners, temperatures, and voltages to account for the PVT effect. First, the delay and power variability as well as the SD are obtained at different corners, temperatures, and voltages, as given in Table 3 and Figs. 8, 9 and 10. Table 3 shows that both delay and power variability are reduced at most of the corners in contrast to CDC [4] and SARDDC [31].
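Since Table 3 lists both $\sigma$ and $\sigma/\mu$, the mean delay at each corner can be recovered as $\mu = \sigma / (\sigma/\mu)$, which gives a rough check that the circuits are compared at similar delay. The sketch below does this for the delay columns of CDC [4] and the proposed domino; the recovered means are approximate because the tabulated values are rounded.

```python
# Delay columns of Table 3: (sigma in ps, sigma/mu) per process corner.
table3_delay = {
    "CDC":      {"TT": (8.748, 0.111), "FF": (5.660, 0.090), "SS": (11.330, 0.107),
                 "FS": (9.672, 0.121), "SF": (12.100, 0.108)},
    "Proposed": {"TT": (4.059, 0.050), "FF": (3.552, 0.047), "SS": (5.038, 0.051),
                 "FS": (3.231, 0.043), "SF": (8.445, 0.077)},
}

def implied_mean(sigma: float, variability: float) -> float:
    """Mean delay implied by sigma and sigma/mu (approximate: inputs are rounded)."""
    return sigma / variability

def sigma_reduction_percent(corner: str) -> float:
    """Reduction of the delay sigma of the proposed circuit relative to CDC, in percent."""
    s_cdc = table3_delay["CDC"][corner][0]
    s_new = table3_delay["Proposed"][corner][0]
    return (1.0 - s_new / s_cdc) * 100.0

for corner in ("TT", "FF", "SS", "FS", "SF"):
    mu_cdc = implied_mean(*table3_delay["CDC"][corner])
    mu_new = implied_mean(*table3_delay["Proposed"][corner])
    print(f"{corner}: mean {mu_cdc:.1f} ps vs {mu_new:.1f} ps, "
          f"sigma reduced by {sigma_reduction_percent(corner):.1f}%")
```

At the TT corner the implied means are close to 80 ps for both circuits, in line with the delay listed for 32-bit fan-in in Tables 1 and 2, although the fan-in used for Table 3 is not stated explicitly.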

Further, the proposed design and CDC [4] are simulated for 32-bit fan-in at different supply voltages to consider voltage variation, as given in Fig. 8. All values of PD and delay in Fig. 8 are normalized to CDC [4]. Figure 8 shows that, for both domino circuits, PD increases with supply voltage whereas delay decreases, because PD depends directly on the square of $V_{DD}$. Moreover, the PD and delay of the proposed domino circuit are lower than those of CDC [4] at each supply voltage and fluctuate less with voltage.
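The quadratic dependence can be made concrete with the standard first-order expression for switching power. Here $\alpha$ (activity factor) and $C_L$ (switched capacitance) are symbols introduced for illustration, and the 1.1 V operating point is a hypothetical example rather than a value read from Fig. 8:

```latex
\begin{align}
P_{\text{dyn}} &= \alpha\, C_L\, V_{DD}^{2}\, f, \\
\frac{P_{\text{dyn}}(1.1\ \text{V})}{P_{\text{dyn}}(1.0\ \text{V})} &= \left(\frac{1.1}{1.0}\right)^{2} = 1.21 .
\end{align}
```

Under this first-order model, a 10% increase in supply voltage raises the switching power by about 21% at fixed frequency and load; leakage and short-circuit contributions are not captured.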

Furthermore, the proposed domino circuit and CDC [4] are simulated at different temperatures, and the resulting values of delay and PD are plotted in Figs. 9 and 10 so that the two circuits can be compared. Figures 9 and 10 show that, for the proposed domino, delay and PD increase linearly with temperature, as expected when temperature rises. For CDC [4], in contrast, both PD and delay increase linearly at first but start to decrease above 100 °C.

Thus, it can be said that the proposed domino is more consistent and more tolerant of fluctuations caused by process and environment. Hence, all the PVT results verify that the robustness, stability, and reliability of the proposed domino are enhanced in contrast to SARDDC [31] and CDC [4]. These enhancements result from the reduced transconductance of the keeper transistor under the bulk-driven technique and from the efficient control of the keeper transistor.

| Fan-in (delay) | Metric           | CDC [4] | HSDC [20] | LCRDC [21] | LPDC [23] | VTVKC [29] | VTKDC [1] | SARDDC [31] | VNRDC [32] | HSCDDKDC [34] | Proposed domino |
|----------------|------------------|---------|-----------|------------|-----------|------------|-----------|-------------|------------|---------------|-----------------|
| 8-bit (60 ps)  | $\sigma_{PD}$    | 1.00    | 0.97      | 1.00       | 0.92      | 0.91       | 0.89      | 0.90        | 0.93       | 0.89          | 0.84            |
|                | $\sigma_{Delay}$ | 1.00    | 0.83      | 0.84       | 0.91      | 0.84       | 0.90      | 0.94        | 0.89       | 0.90          | 0.85            |
| 16-bit (70 ps) | $\sigma_{PD}$    | 1.00    | 0.93      | 0.95       | 0.84      | 0.76       | 0.84      | 0.86        | 0.89       | 0.86          | 0.83            |
|                | $\sigma_{Delay}$ | 1.00    | 1.01      | 0.83       | 0.81      | 0.91       | 0.88      | 0.89        | 0.86       | 0.87          | 0.82            |
| 32-bit (80 ps) | $\sigma_{PD}$    | 1.00    | 0.89      | 0.90       | 0.91      | 0.87       | 0.83      | 0.85        | 0.86       | 0.84          | 0.81            |
|                | $\sigma_{Delay}$ | 1.00    | 0.96      | 0.97       | 0.97      | 0.89       | 0.93      | 0.88        | 0.84       | 0.85          | 0.77            |
| 64-bit (90 ps) | $\sigma_{PD}$    | 1.00    | 0.87      | 0.90       | 0.89      | 0.82       | 0.80      | 0.83        | 0.88       | 0.82          | 0.75            |
|                | $\sigma_{Delay}$ | 1.00    | 0.91      | 0.98       | 0.86      | 0.80       | 0.77      | 0.79        | 0.77       | 0.78          | 0.69            |

**Table 2** Comparative analysis of normalized ( $\sigma$ ) standard deviation in the delay and PD at different fan-in among proposed domino and previously reported domino keeping similar delay

| Process corner | CDC [4]           |              |               |              | SARDDC [31]       |              |               |              | Proposed domino   |              |               |              |
|----------------|-------------------|--------------|---------------|--------------|-------------------|--------------|---------------|--------------|-------------------|--------------|---------------|--------------|
|                | PD                |              | Delay         |              | PD                |              | Delay         |              | PD                |              | Delay         |              |
|                | $\sigma$ ($\mu$W) | $\sigma/\mu$ | $\sigma$ (ps) | $\sigma/\mu$ | $\sigma$ ($\mu$W) | $\sigma/\mu$ | $\sigma$ (ps) | $\sigma/\mu$ | $\sigma$ ($\mu$W) | $\sigma/\mu$ | $\sigma$ (ps) | $\sigma/\mu$ |
| TT             | 0.651             | 0.041        | 8.748         | 0.111        | 0.542             | 0.048        | 5.678         | 0.070        | 0.491             | 0.046        | 4.059         | 0.050        |
| FF             | 0.608             | 0.038        | 5.660         | 0.090        | 0.576             | 0.041        | 4.983         | 0.069        | 0.568             | 0.044        | 3.552         | 0.047        |
| SS             | 0.588             | 0.038        | 11.330        | 0.107        | 0.302             | 0.029        | 7.256         | 0.072        | 0.268             | 0.029        | 5.038         | 0.051        |
| FS             | 0.392             | 0.028        | 9.672         | 0.121        | 0.336             | 0.026        | 5.781         | 0.075        | 0.295             | 0.031        | 3.231         | 0.043        |
| SF             | 0.950             | 0.051        | 12.100        | 0.108        | 0.885             | 0.065        | 9.816         | 0.089        | 0.785             | 0.063        | 8.445         | 0.077        |

**Table 3** Comparative analysis of ( $\sigma$ ) SD and ( $\sigma/\mu$ ) process variability in the delay and PD among the proposed domino, SARDDC [31] and CDC [4] at different corners keeping similar delay

Fig. 8 Comparative analysis of voltage variation in delay and PD for the proposed domino and CDC [4]

Fig. 9 Comparative analysis of temperature variation in PD for the proposed domino and CDC [4]

Fig. 10 Comparative analysis of temperature variation in delay for the proposed domino and CDC [4]

#### 5 Conclusion

Wide fan-in OR-gate domino circuits suffer from high process variations and subthreshold leakage current, which degrade their performance. A novel technique that decreases the delay and power variations and the PDP while improving noise immunity is therefore presented and compared with existing domino circuits. In this technique, the keeper transistor is connected in the bulk-driven configuration, which reduces the effect of process variations by decreasing the keeper's transconductance. Moreover, a keeper controlling network is designed for efficient control of the keeper so that the proposed domino offers low PDP and improved noise immunity. However, the extra transistors used in this network increase the total area of the keeper circuit. In 2000-run Monte Carlo simulations in the ADE-XL environment, the proposed keeper style reduces the SD of delay and power by up to 31% and 25%, respectively (Table 2), compared with the conventional keeper style. The simulations demonstrate that the bulk-driven PMOS keeper transistor offers lower variations in delay and power, with good reliability and robustness, than the gate-driven keeper.

#### References

1. Dadgour HF, Banerjee K (2010) A novel variation-tolerant keeper architecture for high-performance low-power wide fan-in dynamic OR gates. IEEE Trans Very Large Scale Integr (VLSI) Syst 18(11):1567–1577
2. Suzuki H, Kim CH, Roy K (2007) Fast tag comparator using diode partitioned domino for 64-bit microprocessors. IEEE Trans Circuits Syst 54(2):322–328
3. Krishnamurthy RK et al (2002) A 130-nm 6-GHz 256×32 bit leakage-tolerant register file. IEEE J Solid-State Circuits 37(5):624–632
4. Rabaey JM, Chandrakasan AP, Nikolic B (2002) Digital integrated circuits, vol 2. Prentice Hall, Englewood Cliffs
5. Gronowski P (2001) Issues in dynamic logic design. In: Chandrakasan A, Bowhill WJ, Fox F (eds) Design of high-performance microprocessor circuits. IEEE Press, Piscataway, pp 140–157 (ch. 8)
6. Divya, Mittal P (2022) A low-power high-performance voltage sense amplifier for static RAM and comparison with existing current/voltage sense amplifiers. Int J Inf Technol 14:1711–1718. https://doi.org/10.1007/s41870-022-00916-x
7. Sharma S, Devasia R, Sharma G (2020) A novel low power and highly efficient inverter design. Int J Inf Technol 12(4):1111–1116
8. Croon JA, Sansen W, Maes HE (2005) Matching properties of deep sub-micron MOS transistors. Springer, New York
9. Eisele M et al (1997) The impact of intra-die device parameter variations on path delays and on the design for yield of low voltage digital circuits. IEEE Trans Very Large Scale Integr (VLSI) Syst 5(4):360–368
10. Alvandpour A, Larsson-Edefors P, Svensson C (1999) A leakage-tolerant multi-phase keeper for wide domino circuits. In: Proceedings of ICECS'99, 6th IEEE international conference on electronics, circuits and systems, vol 1. IEEE
11. Mahor V, Pattanaik M (2015) Novel NBTI aware approach for low power FinFET based wide fan-in domino logic. J Low Power Electron 11(2):225–235
12. Srivastava A, Sylvester D, Blaauw D (2006) Statistical analysis and optimization for VLSI: timing and power. Springer Science & Business Media
13. Alioto M, Palumbo G, Pennisi M (2010) Understanding the effect of process variations on the delay of static and domino logic. IEEE Trans Very Large Scale Integr (VLSI) Syst 18(5):697–710
14. Alvandpour A et al (2002) A sub-130-nm conditional keeper technique. IEEE J Solid-State Circuits 37(5):633–638
15. Mahmoodi-Meimand H, Roy K (2004) Diode-footed domino: a leakage-tolerant high fan-in dynamic circuit design style. IEEE Trans Circ Syst I Regul Pap 51(3):495–503
16. Jeyasingh et al (2011) Adaptive keeper design for dynamic logic gates using rate sensing technique. IEEE Trans Very Large Scale Integr (VLSI) Syst 19(2):295–304
17. Peiravi A, Asyaei M (2012) Robust low leakage controlled keeper by current-comparison domino for wide fan-in gates. Integr VLSI J 45(1):22–32
18. Peiravi A, Asyaei M (2013) Current-comparison-based domino: new low-leakage high-speed domino circuit for wide fan-in gates. IEEE Trans Very Large Scale Integr (VLSI) Syst 21(5):934–943
19. Kumar A, Nagaria RK (2018) A new leakage-tolerant high speed comparator based domino gate for wide fan-in OR logic for low power VLSI circuits. Integration 63:174–184
20. Anis MH, Allam MW, Elmasry MI (2002) Energy-efficient noise-tolerant dynamic styles for scaled-down CMOS and MTCMOS technologies. IEEE Trans Very Large Scale Integr (VLSI) Syst 10(2):71–78
21. Lih Y, Tzartzanis N, Walker WW (2007) A leakage current replica keeper for dynamic gates. IEEE J Solid-State Circuits 42(1):48–55
22. Asyaei M (2015) A new leakage-tolerant domino circuit using voltage-comparison for wide fan-in gates in deep sub-micron technology. Integr VLSI J 51:61–71
23. Asyaei M, Ebrahimi E (2018) Low power dynamic circuit for power efficient bit lines. AEU-Int J Electron Commun 83:204–212
24. Frustaci F et al (2008) High-performance noise-tolerant circuit techniques for CMOS dynamic logic. IET Circ Device Syst 2(6):537–548
25. Gong N, Wang J, Sridhar R (2014) Variation aware sleep vector selection in dual Vt dynamic OR circuits for low leakage register file design. IEEE Trans Circ Syst 61(7):1970–1983
26. Frustaci F et al (2014) Analyzing noise robustness of wide fan-in dynamic logic gates under process variations. Int J Circ Theory Appl 42(5):452–467
27. Patnaik S, Hari U, Ahuja M, Narang S (2015) A modified variation-tolerant keeper architecture for evaluation contention & leakage current minimization for wide fan-in domino structures. In: Proceedings of the 2015 international conference on circuits, power and computing technologies (ICCPCT-2015), pp 1–7. https://doi.org/10.1109/ICCPCT.2015.7159348
28. Padhi S, Angeline AA, Bhaaskaran VSK (2017) Design of process variation tolerant domino logic keeper architecture. In: Proceedings of the 2017 international conference on nextgen electronic technologies: silicon to software (ICNETS2), pp 301–308. https://doi.org/10.1109/ICNETS2.2017.8067951
29. Kursun V, Friedman EG (2003) Domino logic with variable threshold voltage keeper. IEEE Trans Very Large Scale Integr (VLSI) Syst 11(6):1080–1093
30. Kim CH, Hsu S, Krishnamurthy R, Borkar S, Roy K (2005) Self calibrating circuit design for variation tolerant VLSI systems. In: Proceedings of the 11th IEEE international on-line testing symposium, pp 100–105. https://doi.org/10.1109/IOLTS.2005.63
31. Palumbo G, Pennisi M, Alioto M (2012) A simple circuit approach to reduce delay variations in domino logic gates. IEEE Trans Circ Syst I Regul Pap 59(10):2292–2300
32. Pal I, Islam A (2018) Circuit-level technique to design variation- and noise-aware reliable dynamic logic gates. IEEE Trans Device Mater Reliab 18(2):224–239
33. Angeline AA, Kanchana Bhaaskaran VS (2019) High speed wide fan-in designs using clock controlled dual keeper domino logic circuits. ETRI J 41(3):383–395
34. Angeline AA, Kanchana Bhaaskaran VS (2020) Speed enhancement techniques for clock-delayed dual keeper domino logic style. Int J Electron 107(8):1239–1253
35. Kuo BC (1987) Automatic control systems. Prentice Hall PTR
36. Blalock BJ, Allen PE (1995) A low-voltage, bulk-driven MOSFET current mirror for CMOS technology. In: Proceedings of ISCAS'95, international symposium on circuits and systems, vol 3, pp 1972–1975. https://doi.org/10.1109/ISCAS.1995.523807
37. Durgam R, Tamil S, Raj N (2022) Low voltage high gain flipped voltage follower based operational transconductance amplifier. Int J Inf Technol 14(3):1643–1648
38. Dubey AK, Nagaria RK (2018) Enhanced gain low-power CMOS amplifiers: a novel design approach using bulk-driven load and introduction to GACOBA technique. J Circ Syst Comput 27:1850204
39. Singh V et al (2022) A common-gate cascaded with cascoded self-bias common source approach for 3.1–10.6 GHz UWB low noise amplifier. Int J Inf Technol 14(5):2389–2398
40. Ovens K, Bittlestone C, Helmick B (1995) Transmission gate circuit. U.S. Patent No. 5,430,408
